INTERSPEECH 2006 1 sing Dominance Functions and udio - Visual Speech Synthesis

نویسنده

  • Jakub Kanis
چکیده

This paper presents results of training of coarticulation models for Czech audio-visual speech synthesis. Two approaches for solution of coarticulation in audio-visual speech synthesis were used, coarticulation based on dominance functions and visual unit selection. For both approaches, coarticulation models were trained. Models for unit selection approach were trained by visualy clustered data. These data were obtained using decision tree algorithm. Outputs of audio-visual speech synthesis for both approaches were assessed and compared objectively.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar

This paper describes our initial work in developing a real-time audio-visual Chinese speech synthesizer with a 3D expressive avatar. The avatar model is parameterized according to the MPEG-4 facial animation standard [1]. This standard offers a compact set of facial animation parameters (FAPs) and feature points (FPs) to enable realization of 20 Chinese visemes and 7 facial expressions (i.e. 27...

متن کامل

Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech

Within the framework of developing an expressive audiovisual speech synthesis, an acoustic and visual analysis of expressive acted speech is proposed in this paper. Our purpose is to identify the main characteristics of audiovisual expressions that need to be integrated during synthesis to provide believable emotions to the virtual 3D talking head. We conducted a case study of a semi-profession...

متن کامل

Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesis

This paper proposes a novel approach for describing the expressive elements in text genres and modeling their acoustic correlates for expressive text-to-speech synthesis (TTS). We apply the three-dimensional PAD (pleasure-displeasure, arousal-nonarousal and dominance-submissiveness) model in describing expressivity. In particular, we define a set of principles for annotating the P and A values ...

متن کامل

Czech Audio-Visual Speech Synthesis with an HMM-trained Speech Database and Enhanced Coarticulation

The task of visual speech synthesis is usually solved by concatenation of basic speech units selected from a visual speech database. Acoustical part is carried out separately using similar method. There are two main problems in this process. The first problem is a design of a database, that means estimation of the database parameters for all basic speech units. Second problem is a way how to co...

متن کامل

An expandable web-based audiovisual text-to-speech synthesis system

The authors propose a framework for audiovisual speech synthesis systems [1] and present a first implementation of the framework [2], which is called MASSY Modular Audiovisual Speech SYnthesizer. This paper describes how the audiovisual speech synthesis system, the ‘talking head’, works, how it can be integrated into web-applications, and why it is worthwhile using it. The presented application...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006